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Abstract. A statistical thermodynamic development is given of a new implicit sol- 
vent model that avoids the traditional system size limitations of computer simulation 
of macromolecular solutions with periodic boundary conditions. This implicit solvent 
model is based upon the quasi-chemical approach, distinct from the common integral 
equation trunk of the theory of liquid solutions. The idea is geometrically to define 
molecular-scale regions attached to the solute macromolecule of interest. It is then 
shown that the quasi-chemical approach corresponds to calculation of a partition func- 
tion for an ensemble analogous to, but not the same as, the grand canonical ensemble 
for the solvent in that proximal volume. The distinctions include: (a) the defined prox- 
imal volume — the volume of the system that is treated explicitly — resides on the 
solute; (b) the solute conformational fluctuations are prescribed by statistical thermo- 
dynamics and the proximal volume can fluctuate if the solute conformation fluctuates; 
and (c) the interactions of the system with more distant, extra-system solution species 
are treated by approximate physical theories such as dielectric continuum theories. 
The theory makes a definite connection to statistical thermodynamic properties of the 
solution and fully dictates volume fluctuations, which can be awkward in more ambi- 
tious approaches. It is argued that with the close solvent neighbors treated explicitly 
the thermodynamic results become less sensitive to the inevitable approximations in 
the implicit solvent theories of the more distant interactions. The physical content 
of this theory is the hypothesis that a small set of solvent molecules are decisive for 
these solvation problems. A detailed derivation of the quasi-chemical theory escorts 
the development of this proposal. The numerical application of the quasi-chemical 
treatment to Li + ion hydration in liquid water is used to motivate and exemplify the 
quasi-chemical theory. Those results underscore the fact that the quasi-chemical ap- 
proach refines the path for utilization of ion-water cluster results for the statistical 
thermodynamics of solutions. 



INTRODUCTION 



Direct simulation of macromolecules in aqueous solutions typically requires con- 
sideration of a mass of solution large compared to the mass of the macromolecular 
solute. Frequently, the bulk of the solution is of secondary interest. The extrava- 
gant allocation of computational resources for direct treatment of macromolecular 
solutions limits the scientific problems that may be tackled. Thus, implicit solvent 
models that eliminate the direct presence of the solvent in favor of an approximate 
description of the solvation effects have received universal and extended interest 
[1-53]. 

These issues are specifically relevant to this workshop on electrostatic interac- 
tions in solution for two reasons. First, electrostatic interactions exacerbate the 
difficulties of solute size mentioned above. If all interactions were short-ranged, 
most practitioners would be satisfied with the traditional approach, adopting peri- 
odic boundary conditions and empirically examining the system size dependence of 
their results by performing calculations on successively larger systems. Second, the 
additional computational requirements to treat genuinely chemical phenomena in 
solution by in situ electronic structure calculations again limits the problems that 
can be addressed. In fact, the principal physical concepts involved in electronic 
structure calculations for chemical problems in liquid water universally involve elec- 
trostatic interactions of long range. The important example of metal ion chemistry 
in proteins combines these points. 

Although a variety of implicit solvent models have been tried by now in numer- 
ous selected applications, they are still limited in fundamental aspects. A helpful 
recent discussion of important limitations was given by Juffer and Berendsen [20]. 
They emphasize that implicit solvent models should be significantly simpler than 
explicit models in view of the approximations and ad hoc features that are accepted. 
Solution chemistry problems that require direct treatment of electronic degrees of 
freedom are one kind of problem that must be made significantly simpler to per- 
mit a broader computational attack. A striking example is the current 'ab initio' 
molecular dynamics calculations of aqueous solutions of simple ions. They do treat 
electronic structure issues within simulations but have been typically limited to 
total system sizes of 16 - 32 water molecules in periodic boundary conditions [54], 
small systems by current standards with classical simulation models. These small 
sizes do limit the conclusions that might be drawn; the structure and dynamics 
of the second hydration shell of a Li + ion in liquid water, the example discussed 
below, undoubtedly requires calculation on systems larger than 32 water molecules. 
Nevertheless, much can be learned from the study of such small systems, partic- 
ularly when chemical effects of the interactions of a solute with proximal solvent 
molecules are the issues of greatest importance [55]. 

The idea for the developments presented here is aggressively to adopt the Juffer- 
Berendsen suggestion that implicit solvent models must be significantly simpler 



than explicit models and apply that philosophy to the statistical thermodynamic 
treatment of aqueous solutions. To that end, we acknowledge that some water 
molecules play a specific, almost chemical, role in these hydration phenomena. We 
then work out the theory that permits inclusion of a small number of such molecules 
explicitly. The required theory is a descendent of the quasi-chemical approximations 
of Guggenheim [56], Bethe [57], and Kikuchi [58]. It is a significant simplification 
of direct simulation calculations. Roughly described, that theory organizes and 
justifies treatment of a handful of water molecules essentially as ligands of the 
macromolecule of interest, letting the more distant solution environment be treated 
by simple physical approximations such as the popular dielectric continuum models. 

The plan of this presentation is first to introduce an example, the hydration 
of the Li + ion, that permits a convenient discussion of the quasi- chemical theory. 
That example illustrates the quasi-chemical pattern for the theory, exemplifies the 
basic molecular information required to construct quasi-chemical predictions, and 
offers a simplified derivation based upon a thermodynamic model. Following that 
we give an extended theoretical development of the quasi-chemical organization of 
calculations of solvation free energies and then use those theoretical results to sug- 
gest an explicit-implicit solvent model for statistical thermodynamic calculations 
of solutions. 




FIGURE 1. The minimum energy structure of a Li + ion with six water molecules. The inscribed 
ball identifies the inner shell occupied by the four nearest water molecules. The two further water 
molecules are in an outer shell. This illustrates the definition of the bonding or inner shell region. 



EXAMPLE: THE LI+ ION IN WATER 



The hydration of atomic ions provides a conceptually simple context in which 
to consider the quasi-chemical approaches developed below. The previous study 
of the hydration of ferric ion, Fe 3+ (aq), provides one such example [59]. Here we 
motivate the formal developments of the theory by considering the hydration of 
the Li + ion in dilute aqueous solution and we will discuss the principal theoretical 
structures in this context first. We anticipate some of the discussion below by 
noting that Li + (aq) proves to be a difficult case in some important respects. Thus, 
we expect to return to this example in later work. Indeed, the goal of the theoretical 
developments initiated here is of a sufficiently constructive nature that all pieces of 
the puzzle needn't become available at the same time! 



Quasi- chemical Structure for the Hydration Free Energy 



The quasi-chemical theory suggests expressing the chemical potential of the 
lithium ion species, Hu+, m terms of ideal and non-ideal, or interaction, contri- 
butions: 



/3fx Li + = In 



PLi 



q Li+ /V 



lnpo — m 



Ph 2 o 



n>0 



(1) 



with f3~ l = RT. In the last contribution, only a limited number of terms are in- 
cluded in the sum. That limited number is the maximum number of water molecules 
that may occur as inner hydration shell ligands. For the Li + ion example, that lim- 
ited number need not be greater than six (6). Fig. 1 shows the minimum energy 
structure of the Li + ion with six water molecules [60]. 

The coefficients K n of Eq. (1) will be defined more fully later. A virtue of this 
approach is that the natural initial approximation is K n m the equilibrium 

ratio for the reactions that form various inner shell clusters 



Li + + nH 2 0^ Li + (H 2 0)r 



(2) 



without consideration of medium effects, as for an ideal gas. Those coefficients can 
be assembled within the harmonic approximation for the cluster vibrations utilizing 
standard electronic structure computational packages [61]. That idea is reinforced 
by the results in Tables 1, 2, and 3, which we use to develop this example [60]. 

Within the quasi-chemical approximation of collective phenomena [58], the fac- 
tors of solvent density that appear explicitly in Eq. (1) can be recognized as mean- 
field estimates of the influence of the solvent on the chemical potential of the 
Li + (aq) ion. That form of the solvent mean field, however, derives from specific 
compositional effects. For the realistic circumstances that long-ranged interactions 
of the classic electrostatic type are important, the approximation K n should 
be revised with implicit solvent theories intended to describe the influence of the 



TABLE 1. Electronic energy (E), zero-point energy (ZPE), the inter- 
action part of the chemical potential (fi*) for Li(H 2 0)„ + complexes, 
and the Mulliken charge for the Li + ion in each complex utilizing the 
B3LYP and dielectric continuum solvation approximations. Geome- 
tries and ChelpG partial charges were obtained with basis 6-31++G** 
for O, H, and 6-31G* for Li+. The Mulliken partial charges were ob- 
tained with the smaller basis 6-31G on all atoms in order to simplify 
their consideration. Dielectric radii for all atoms were taken from 



Rcf. [62] except R ii+ =2.0 A. 
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TABLE 2. Ideal gas thermochemistries (kcal/mole) at T=298.15 K and p=l atm, 
using the B3LYP functional [63]. The first column gives the electronic energy con- 
tribution, the second the zero-point energy, the third the energy, the fourth the 
enthalpy, and the fifth column is the Gibbs free energy, all at 298.15 K. 
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more distant solution environment on the chemical potential of the Li + (aq) ion. 
This hints at the strategy for the quasi-chemical approaches. Neighbors occupying 
a carefully defined proximal volume are treated explicitly. But more distant neigh- 
bors are treated implicitly with the view that they can be satisfactorily treated 
with simple theories and, in any case, those more distant effects are a smaller part 
of the whole. 

The quantity -lnpo of Eq. (1) gives the free energy in units required to open 
this defined proximal volume in the solvent in order that the construction of the 
complexes of Eq. (2) can be pursued. This quantity is an object of research in the 
theory of liquids [64] in its own right. For specific applications an approximate form 
must be used. In the present example of the hydration of the Li + ion, this packing 
contribution is expected to be of a secondary size, provided the thermodynamic 



TABLE 3. Construction of aqueous solution thermochemical results 
(kcal/mole) at T=298.15 K. The first column gives the gas-phase free en- 
ergy from Table 2 for the sequential hydration reactions, the second reports 
the reaction net free energy after adjustment for the actual concentration of 
liquid water, and the third column gives the interaction part of the chemical 
potential change, A^* = V* Li{H20)n + -Mh 2 o -/ u Li(ff 2 0) n _ 1 +' for n =b2, ■ ■ - 6 > 
approximated with the dielectric model. The fourth column is the total free 
energy change for the indicated reaction in aqueous solution, the sum of the 
second and third columns. To construct the results of Fig. 2, use fi* Li+ w 
-82.0 kcal/mol, the Born model result for R Li +=2.0 A. 
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state is not varied too broadly. 

Finally, the quantity q Li + of Eq. (1) is the canonical partition function for a single 
Li + in a volume V at temperature T [65]. The initial term of Eq. (1) is recognized 
as the ideal (no interactions) contribution to fiu+- Thus, the second and third 
terms there express the interactions of the Li + ion with the solvent and we will 
refer to these contributions as (3A/i Li +. 

As a point of reference, for the present example of the Li + at infinite dilution in 
water under the normal conditions of T=298.15 K and pw=l g/cm 3 , we calculate 
the quasi-chemical contributions (the last term) to Eq. (1) to be -128 kcal/mole 
using R,£j+=2.0A. The experimental values for the total A/X£j+ are -114 kcal/mol 
[66] and -125 kcal/mol [67]. 1 The contributions of repulsive interactions will raise 
that calculated value toward the experimental whole, but it is unlikely to raise it 
enough to exceed the highest of these values. Though the agreement is satisfactory, 
it seems likely that the calculated result is too low, meaning that the Li + ion is 
less well bound to liquid water than the current calculation predicts. Additionally, 
the clusters have been treated in the harmonic approximation in obtaining this 
theoretical value [60]. 

A Thermodynamic Model 

One physical view of the quasi-chemical approach is the following: Li + ions in 
aqueous solution can, in principal, be polled for the number n of water molecule 



^ The value found in [67] was lowered by RT In [RT/ (1 atm liter/mol) ] = 1.9 kcal/mol to convert 
to the standard state adopted here. 
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FIGURE 2. Free energies for Li + (aq) ion hydration in liquid water plotted as a function of the 
number, n, of inner shell water neig hbors, T=298.15 K; see Table 3. The results marked AG (0) 
are the free energies predicted by the electronic structure calculations for the reaction Li + + 
nH^O = Li(H20)„ + under standard ideal conditions, including p = 1 atm. The minimum value 
is at n=4. The solid circles labeled -RT In x n + A/i Li + are the final, net values that include the 
PH 2 o n factors of Eq. (1) adjusting the water molecule concentration from the ideal case to the 
actual concentration of liquid water, and the extra-cluster interaction contributions of Eq. (36). 
The final results predict that the n=4 inner sphere structure is most probable in liquid water 
under normal conditions. 



ligands each possesses. We could determine the concentrations of Li + ions having 
precisely n water molecule ligands. In fact, the n th term in the sum of Eq. (1) 
is proportional to those concentrations. Denoting the mole fraction of Li + ions 
with n water molecule ligands by x n , successive terms there can be expressed as 
x n exp(— (3Afi Li +) [68]. Those mole fractions, however, are not independent ther- 
modynamic variables but are governed by the principles of chemical equilibrium. 
In our example of Li + (aq), the quasi-chemical theory predicts the mole fractions of 
Li + ions with n water molecule ligands, as is shown in Fig. 2. 

These ideas can be the basis of an unrefined but almost thermodynamical deriva- 
tion of the pattern in Eq. (I). 2 To develop this 'derivation', we first make explicit 



2 ' This argument is conceptually simplest and most simply expressed for the case of a chemically 
distinct solute at infinite dilution. So we limit this presentation to that case. 



that we consider the concentrations of the Li(H 2 0)+ complexes of different size n 
despite the fact that it is the concentration of Li + ions that can be experimentally 
manipulated. Thus, we introduce a new set of composition variables that refer to 
the ion complexes, 

N Li+ = N Li(H20) +, 
n>0 

N H2 = Nh 2 + U ^Li{H 2 0)t ■ ( 3 ) 

Here the total number of each species in solution, Nn+ and Nh 2 o, is expressed in 
terms of N Li , H20 ^,+ , the numbers of complexes of different sizes n, and Nh 2 o, the 
number of water molecules not complexed to an ion. 

For the problem initially addressed in terms of variables Nh 2 o and N Li +, these 
relations provide a translation to express the problem in terms of an alternative set 
of variables such as Nh 2 o and N Li ^ H20 ^+. In the formal definition of the Gibbs free 

energy G, for example, with these changes of variables the multiplier of N Li ^ H2Q ^+ 

becomes (fiu+ +nfin 2 o)- If the molecule number N Li , H20 -,+ could be varied inde- 
pendently of other such composition variables, we would identify this multiplier as 
the chemical potential conjugate to N Li ^ H20 ^+. In contrast, for the problem initially 

formally addressed in terms of variables like Nh 2 o and N Li ^ H20 ^+, as we do here, 
then the relative concentrations, x n = N Li ^ H20 - ) +/N Li +, provide a translation to the 
problem expressed in terms of the standard variables such as Nh 2 o and N Li +. The 
coefficient of N Li + in the Gibbs free energy expression is 

J2 X n (&Li(H 2 0)+ ~ n ^H 2 o) = VLi+, (4) 
n>0 

the thermodynamic chemical potential conjugate to Nn+. The differences within 
the parenthesis depend on relative concentrations of complexes of different sizes. 
These relative concentrations are established by nontrivial conditions of equilib- 
rium. 

To show the consequences of those conditions of equilibrium, we adopt the stan- 
dard form of the chemical potential, 3 

fiHe = In [p,jV/q a ] , (5) 

so that 



3 ) In the limit of low density, or if the interactions are neglected, then q a is the canonical partition 
function for one molecule (or complex) of type a in volume V at temperature T. With a more 
general understanding of the divisor of p a , the form asserted for the ideal case is more generally 
valid. It is with the molecular identification of these factors that the present derivation goes 
beyond a purely thermodynamic analysis. We will return to this point below. 



PP-Li+ = X n ln 



n>0 



pLi+Xn{qH 2 o/V) n 

PH 2 o n {q Li{H2 o)tl V ) 



(6) 



The final essential step in this thermodynamic model is the quasi-chemical step 
according to which the relative concentrations of the ion complexes of different 
sizes are expressed as a function of the chemical equilibrium constants, 



k2» 



PH 2 



m>0 

Consistent with the notation above, chemical equilibrium requires that 



(7) 
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IV) ( q H 2 o/V) n 



(8) 



The K(f) are a function of the temperature T as the only thermodynamic variable. 
When the results above are collected, we remember that the x n are non-negative 
weights that sum to one and we notice that qu+ = qu(H 2 o) + so that we can 
separate the correct ideal contribution to pu+- Thus the chemical potential of 
the lithium ion separates into ideal and non-ideal contributions, with the latter 
contribution written in terms of the equilibrium ratios K^ ) that appeared in Eq. (1), 



f3p Li + = ln 



Phi 



(q Li+ /V) 



-ln 



Ti>0 



(9) 



Within the simplified context of this derivation, this is the result that was sought. 

The previous equation should be compared with Eq. (1). Note that it doesn't 
address the initial packing contribution -lnpo, or the other extra-cluster interaction 
contributions incorporated into K n of Eq. (1). Furthermore, it finesses the recogni- 
tion issue of defining cluster structures for counting. Nevertheless, we can note that 
the quasi-chemical step, Eq. (7), would arise in a pedagogical context by making 
the free energy stationary with respect to variations in the progress of chemical re- 
actions. Because of this variational aspect these approaches are sometimes referred 
to as cluster-variation theories [58]. 



Variations of Hydration Free Energies 

Predictions of variations of hydration free energy changes with temperature, 
pressure, composition, and solute geometry are the foremost practical weaknesses 
of dielectric continuum models for hydration free energies. An important facet of 
this issue is that the radii parameters in the dielectric models, which are initially 
established fully empirically, depend on solution thermodynamic conditions such 



as temperature, pressure, and composition, and on solute conformation [69-73]. 
The Li + (aq) case gives direct example of this difficulty: the partial molar volume 
of the Li + (aq) is negative, -6.4 cm 3 /mol [66]. The magnitude of this important 
thermodynamic quantity is not explicable simply on the basis of the Born model 
[66] and the sign cannot be rationalized either if any reasonable description of 
excluded volume effects is included. At this point, considerable further evolution of 
dielectric models typically occurs with additional empirical parameters [66,74]. In 
contrast, an important feature of the present approach is that explicit contributions 
to the temperature and density dependences of the chemical potential are readily 
available. This facilitates investigation of thermodynamic derivatives. 



Pressure Variation of Hydration Free Energies 



The variation of the hydration free energies with pressure is the partial molar 
volume and gives direct information on hydration structure. Thus, it is particularly 
valuable in learning about the hydration process. Recalling that the chemical 
potential expression used in the current approach is written explicitly in Eq. (1), 
the partial molar volume (dV / dN Li +)\ f3p Nr q is 



V L i 
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dp L i+ 
d(3p 
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P,n H2 o,n u+ 



Here we are interested exclusively in the conditions of infinite dilution of the aqueous 
solution so that 



lim v Li + = (kt/P) 



Ph 2 o 



fd/3 Ap 



Li+ 



V dp H2 o 



where k t = (— 1/V) (dV/dp) T is the isothermal coefficient of bulk compressibility 
of the pure solvent. Upon differentiation, the first two terms of Eq. (1) produce 
the partial molar volume of a hard inert sphere of the right size at infinite dilution; 
call it vq. As indicated above this is a topic of separate study [75]. Thus, 



lim v Li 



vo ~ (ph 2 oRTk t ) 



d 



In 



n>0 



[12) 



If the initial approximation K n m is retained, or more generally if the extra- 
complex, or implicit solvent parts are only weakly density dependent, then the 
derivative indicated in Eq. (12) can be evaluated explicitly. Within this approxi- 
mate view, the formula in Eq. (12) can be given a direct interpretation. We can 
parse the final term in Eq. (12) as 



Av = (K T p H2 oRT) v H2 o n 



(13) 



where vh 2 o — V ' Ph 2 o is the partial molar volume of the pure solvent (water) and 



The leading factor p H2 qRTkt appropriately converts a density change to a pressure 
change. The following factors assess the volume change per ligand and the number 
of ligands. This interpretation doesn't work in this specific sense if the density 
dependence of the coefficients in Eq. (14) is significant. (It is interesting to note, 
however, the more general interpretation given by References [76,77] to the partial 
molar quantities.) 

We find n = 4.0, as demonstrated in Fig. 2 in the present example of the 
Li + at infinite dilution in water under the normal conditions of T=298.15 K and 
Pw=l g/cm 3 . As depicted in Fig. 3, the value n = 4.0 has been corroborated by 
'ab initio' molecular dynamics calculations [60]. Combined with the other factors, 
n = 4.0 produces a contribution of -4.9 cm 3 /mole to the partial molar volume of 
Eq. (13). The simplest idea for estimation of the non-electrostatic contributions 
is to use the experimental value for the partial molar volume of He as a solute in 
water, 14.8 cm 3 /mole [78], making the total predicted partial molar volume of Li + 
t>£j+=9.9 cm 3 /mole. Therefore, this pain-staking accounting of the effects through 
the first hydration shell, detailed though it is, does not satisfactorily explain the 
partial molar volume of Li + in liquid water. Since Li + (aq) is known to have a 
strongly structured second hydration shell [79-88], use of the dielectric model for 
the interactions of the Li + with the outer hydration shells is a concern and may 
account for the difference between the calculated and experimental partial molar 
volumes. 

We note that Eq. (13) assumes that the implicit contributions to the pressure 
variation are negligible. Use of the Born model directly without an empirical at- 
tribution of pressure dependence to the Born radii but with the measured pressure 
dependence of the solvent dielectric constant [89] contributes -0.3 cm 3 /mol, a neg- 
ligible magnitude here. The density dependences of hydration free energies for the 
auto-dissociation reaction 2H 2 # H 3 + + OH~ in water were studied some time 
ago [71] from the point of view of a dielectric continuum model with the conclusion 
that a non-trivial density dependence was required of the dielectric model by the 
experimental data. In contrast, the formula Eq. (13) is properly insensitive to radii 
parameters assigned to the solute. 



The temperature variation of the hydration free energy is the partial molar en- 
tropy and, because of its interpretation as an indicator of disorderliness, is of wide 




(14) 



Temperature Variation of Hydration Free Energies 




FIGURE 3. A structure from 'ab initio' molecular dynamics calculations based upon a gra- 
dient-corrected electron density functional description of the interatomic forces, after Ref. [60]. 
Initially a hexa-coordinate inner sphere structure, rigidly constrained, was equilibrated with 26 
additional water molecules by Monte Carlo calculations using classical model force fields. The 
structure shown, with n = 4 water ligands, resulted 125 fs later and remained the dominant inner 
shell complex for the duration of the simulation. The bonds drawn in identify water oxygen atoms 
within 2.65 A of the Li + ion. 



interest. As before, we focus here on the conditions of infinite dilution in the 
aqueous solution. Again, the first two terms of Eq. (1) produces the partial molar 
entropy of a hard inert sphere of the right size at infinite dilution [90]. We will call 
that contribution sq. Differentiation of the quasi-chemical contributions in Eq. (1) 
with respect to temperature at constant pressure yields the partial molar entropy: 
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RT In 



Ph 2 o 
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(15) 



The middle term accounts for the temperature dependence of the solvent den- 
sity, which brings in the coefficient of thermal expansion for the pure solvent a p 
—(l/V)(dV/dT) p , and then requires the density derivative of the quasi-chemical 
contributions. That density derivative was analyzed above when we considered the 
partial molar volume. Using Eq. (12) produces 
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The developments below establish that the fundamental quantities K n are well- 
defined and therefore the temperature derivative indicated here can be investigated 
with well defined procedures. For the purposes of this specific section and example, 
we will insist on the approximation K n K$ . In that case, the last form of 
Eq. (16) highlights quasi-chemical contributions to standard enthalpy changes for 
the reactions in Eq. (2) because 
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(17) 



with x n the normalized relative concentrations of the Li(H 2 0)+ complexes defined 
earlier in Eq. (7) and AH^ the enthalpies of reaction for the ion hydration reactions 
listed in Table 2. With this identification Eq. (16) then becomes 



lim s Li + w s + (vu 



Olr, 
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v )^ + R\n 
kt 
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+ -J2x n AHi°\ (18) 
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The term associated with the quasi-chemical contributions to the partial molar 
volume here contributes -0.5 cal K" 1 mol -1 to the partial molar entropy. The sub- 
sequent terms can be evaluated from Tables 2 and 3, yielding -16.1 cal K -1 mol -1 . 
Again using the experimental value for He as a solute in water to estimate the non- 
electrostatic contribution produces So = -9.3 cal K~ x mob 1 [78] . 4 The combined 



4 ) The value found in [78] was augmented by R ln[p// 2 oRT] = 14.3 cal K 1 mol 1 , with pn 2 o = 
1 g/cm 3 , to transform to the standard state adopted here. 



value for the partial molar entropy of lithium ion is -25.9 cal K 1 mol 1 compared to 
experimental values of -38.5 cal Kr 1 mol" 1 [66] and -40.1 cal Kr 1 mol" 1 [67]. 5 This 
calculated partial molar entropy is less negative than the experimental value. Our 
neglect of the outer hydration shells may explain that difference. In this context, 
however, the comparison given by Friedman and Krishnan [67] of the hydration 
entropies of K + + Cl~ and Ar + Ar is particularly relevant: the hydrophobic 
contributions to these negative entropies of hydration can be the principal contri- 
butions. Here that hydrophobic contribution has been estimated crudely using the 
He results as a model. 

It is striking that this careful account of the inner hydration shell structure, with 
the complete neglect of effects of the more distant medium and thus neglect of 
dielectric properties, produces such a substantial part of the experimental partial 
molar entropy. It is similarly striking that the quasi-chemical contribution to the 
partial molar volume, and specifically n, is insensitive to the dielectric model uti- 
lized. This suggests the hypothesis that the popular dielectric continuum models 
can be satisfactory with some empiricism for the hydration free energies but such 
dielectric effects contribute only secondarily to variations in hydration free energies. 



This section gives a detailed derivation of the quasi-chemical pattern of calcu- 
lation for statistical thermodynamics. This derivation discusses a variety of de- 
tails in order to provide additional perspective on the mechanisms of the theory. 
The construction of the derivation depends on two basic ingredients: the poten- 
tial distribution theorem and a clustering analysis. These developments culminate 
in a generalization of the thermodynamic model above and a quasi- chemical rule, 
Eq. (41), that is then given a succinct direct proof. 



where p a is the density of molecules of type a (the 'solute' under consideration), 
z a = exp(0n a ) is the absolute activity of that species, q CT is the single molecule 
partition function for that species, and V is the volume. The indicated average 
is the usual test particle calculation: the average over the thermal motion of the 
bath of the interaction potential energy between the test particle and the bath. A 
virtue of this result is that the indicated average is similar to a partition function 



QUASI-CHEMICAL THEORY 



Potential Distribution Theorem 



According to the potential distribution theorem [91] 




(19) 



5 ) The value found in [67] was adjusted by R ln[RT/(l atm liter/mol)] = -6.4 cal K" 1 mor 1 to 
convert to the standard state adopted here. 



but local in character. Thus all the concepts relevant to evaluation of a partition 
function are relevant here too, but they typically work out more simply because of 
the local character of this formulation. 

The potential distribution formula, Eq. (19), can be directly generalized to treat 
conformational problems: 



P.(l) = 4° ) (l)(e- /3AC7 ) ni ^ (20) 



/ 0,1 

where s^(l) is the isolated molecule distribution function over single molecule 
conformational space, including translations and rotations. 6 This function is nor- 
malized so that 

/ s^(l)dl = 1. (21) 

The additional subscript on the (. . .) 1 brackets in Eq. (20) indicate that the av- 
erage is obtained for the solute in conformation 1. This generalization suggests a 
notational simplification according to which Eq. (19) can be expressed as 

P , = ((e^ AU )) o zAqjV). (22) 

Now the double brackets ((• • -)) indicate the average over the thermal motion of 
the solute and the solvent under the conditions of no interaction between them, and 
the averaged quantity is the Boltzmann factor of those interactions. The average 
indicated here is the ratio of the activity of an isolated solute, PaV/q ai divided by 
the absolute activity, z a , of the actual solute. In the developments that follow we 
will focus on the resulting feature 



e-^ u )) Q = e~^\ (23) 



performing series expansions and otherwise analyzing this average. 



Clustering Analysis 

The second basic ingredient to deriving the quasi-chemical pattern is a counting 
device used to sort configurations according to proximity. It can be motivated 
by focusing again on the averaged quantity in Eq. (23). If AU were pairwise 
decomposible we would write 

e -/^=n(i +/«*), ( 24 ) 

3 

6 ) It is worth emphasizing in the context of dielectric models that (e~ l3AU ) a 1 involves the charge 
distribution of the isolated molecule, not the 'self-consistent' charge distribution that has relaxed 
to accomodate interactions with the solution external to the solute. 



where i aj is the Mayer f-( cluster) -function describing the interactions between the 
a solute and the solvent molecule j [93] . Series expansions then proceed in a direct 
and simple way for this case. The clustering features of these expansions are valu- 
able and we can preserve them in cases that don't present pairwise decomposable 
AU by writing 

i=n( i +^+^)- (25) 

3 

b a j is one (1) in a geometrically defined crj-bonding region and zero (0) otherwise; 
b a j is an indicator function [94] or an 'inclusion counter.' i a j is then defined by 

Uj = -baj. (26) 

With this setup the i a j is entirely analogous to a Mayer f-function for a hard object 
and can play the same role of monitoring the description of packing effects in liquids. 
Fig. 1 shows the natural definition of the bonding region for the application to 
spherical atomic ions: b a j is one (1) inside the sphere that includes the inner shell 
of water molecules whereas i a j is negative one (-1) there. Both b a j and f a j are zero 
(0) outside of this bonding region. 

The strategy for our derivation will be to insert this 'resolution of 1,' Eq. (25), 
within the averaging brackets of the potential distribution theorem, then expand 
and order the contributions according to the number of factors of b a j that appear. 
We emphasize that physical interactions are not addressed here and that the 'hard 
sphere interactions' appear for counting purposes only. 



Low Order Contributions 

Let's note some of the properties of this expansion and the terms that result. 
Consider initially the term that is zeroth order, no factors of b a j appearing. That 
term is 

th order in b aj + f aj ). (27) 

3 

This would be the interaction potential energy for the bath with a hard object that 
excludes the bath from any bonding region because then f CTJ - =-1 and the statistical 
weight of Eq. (27) would be zero. Next 

I s * order in b aj : Nb al J[ (1 + f oj ). (28) 

i>2 

There is just one bath/solvent molecule in the bonding region and N possibilities 
for that specific molecule due to the existence of N solvent molecules in the system. 
All other molecules are excluded from the bonding region. Next 



2 nd order in b aj : j balba2 JJ (1 + ( 29 ) 

Now there are two bath/solvent molecules in the bonding region, that pair could 
have been chosen in N(N-l)/2 ways and all other molecules excluded. 



General Term 

The pattern of these contributions to the cluster expansion is obvious. If we 
take the general formula for the n th order expansion term and now include the 
Boltzmann factor for the full system of solute and solvent molecules, then we can 
illustrate the equivalence of two different views of the total system. In one view 
the system is divided into one solute and the set of iV solvent molecules compared 
to an alternative view in which the solute might be a cluster. In the cluster view, 
the solute is surrounded by n solvent molecules in the bonding region with the 
remaining N — n solvent molecules occupying the external region. Including the 
interactions, the general term takes the form 

n th order mt,: S f(l)e-«^ ( N )f[b*n U (1 + faj) 

V U J i=l ^{l,-,n} 

= f^)^ (0) (l + n)e-^-"H^ ( N ) n (1 + /* 



(30) 



Although the expression on each side of this equation contains the full Boltzmann 
factor for the complete N+l molecule system, the AU on each side differs. The 
final AU is the difference between the potential energy of the N+l molecule sys- 
tem and the energies of the separate N-n and (1+n) systems. Thus the final AU 
expresses the interaction energy between the separate cluster and external solvent 
systems. None of the energies considered here need be pairwise decomposable. 
<1<tW„ is the canonical partition function of a cluster of n solvent molecules with the 
a (solute) molecule in a volume V at temperature T. In the Li + (aq) example, the 
Li(H20)„ + are the clusters and qii(H 2 o) + are the cluster partition functions. Sim- 
ilarly, s a w n ^(l + n) is the canonical configurational distribution for the complex 
at temperature T. 7 



7 ) A more technical observation: In the last member of Eq. (30), n! is asserted to be the symmetry 
number for the complex. This classically reflects the molecule exchange symmetry of the ligand 
molecules. This is correct if the ligands are identical to one another and different from the solute 
that serves as a nucleus of the cluster. It is also correct if the ligands and nucleus are the same 
species because of the structure of the bonding Boltzmann factors requires the cluster to be of 
star type and the nucleus is thus distinguishable from the ligands on that basis. Finally, the 
same formula ultimately results also if the ligands are not all of the same species. The symmetry 
numbers (particle exchange symmetries) of the individual molecules aren't involved. 



It is convenient and natural to express the averaging required by the basic poten- 
tial distribution formula of Eq. (20) as a product of averages for the two separate 
systems. Since the Boltzmann factors for the separate systems are identified in 
Eq. (30), we can do this merely by factoring the correct normalizing denominators 
into the proper places: 

n th order contribution to ( ( e _/3At/ ) ) : 



n\q„w n \ 1 1 -8AU (N\ ^ „ , , A \ ( Q{N - n){N - n)\ 



e- pAU { N n ) n 



~" AU n (i + /*;)) ) *rq*wjq*, (si) 

J#{l,...,n} / J n 

noting that z n = Q(N — n)/Q(N) are absolute activity factors for the solvent 
molecules. The brackets ((. . .) ) indicate the average with the distribution that is 
the product of the distributions for the bath and for the complex with n ligands in 
addition to the solute. Finally, we define an additional quantity 

Po = ( II (1 + /*;)) • ( 32 ) 

W{l,...,n} / 

Po is independent of n in the thermodynamic limit for finite n. The quantity aver- 
aged is zero for any solvent configuration for which some solvent molecule penetrates 
the defined proximal volume of the solute. In the simplest examples, such as the 
Li + ion example above, this volume does not depend on solute conformation. In 
general, however, po can depend on solute conformation. Consider, for example, an 
extended solute with two or more hydrophilic sites where that the proximal volume 
is defined to be the union of spheres centered on each hydrophilic site. In any case, 
with knowledge of po we can incorporate the exclusion factors of Eq. (31) into the 
bath distribution and write 

n th order contribution to (( e_/?AC7 )) : 

[l>, (e-^Y) z n q aW Jq a , (33) 



01 n 



The superscript * indicates that the bath distribution function now rigidly excludes 
additional solvent molecules from the defined volume. 

Collecting these results and rewriting Eq. (23), finally we get the structure of a 
partition function, 

^ = E (Po (e-" AU ):) z n q<rwjq«, (34) 



e 

. ' ' 0/ n 

n>0 



as intended. The systems described by this partition function, however, do not 
constitute a familiar statistical thermodynamic ensemble. Instead the system is an 
open microscopic volume defined relative to a specific molecule of type a. Addition- 
ally, the energies involved are energy differences. The calculation identifies f3A/j, a 
as the thermodynamic potential determined by this partition function directly. 
This result is directly comparable to pedagogical depictions of Bethe- Guggenheim 
approximate treatments of order-disorder theory [95]. 

Quasi-Chemical Approximation and Generalization 

In a previous publication [68], a specific quasi-chemical approximation was iden- 
tified that allowed a simplified expression of the interaction part of the solute chem- 
ical potential in terms of the solvent densities, rather than activities. A general way 
to achieve this result is to apply Eq. (23) to the ligand (water) species, and use the 
resulting expression to eliminate the activities z in Eq. (34). If we assume that po 
is independent of conformation and that the resulting ratios of bath contributions 
equal unity, 8 then we obtain precisely that quasi-chemical approximation 



The equilibrium ratio for the aggregation reaction where n ligand (water) molecules 
associate with the a species, as in Eq. (2) with Li + ion, evaluated without consid- 
eration of the effects of the more distant medium is K^ -* . That evaluates naturally 
to one (1) when n=0. This is a first opportunity to compare the results of these 
theoretical considerations with the proposed form of the solute chemical potential 
in Eq. (1), where medium effects on the clusters are included. In fact, the example 
results of Eqs. (13) and (18) utilized this approximation. 

For the problems of interest here, long-ranged external-cluster interactions should 
be included, even though the assumption that po is independent of solute conforma- 
tions can be retained a while longer. Since we are in full possession of the formally 
complete quasi- chemical result, Eq. (34), we can restore the correct factors to the 
approximate result in Eq. (35) by defining [73] 



Here the factor in the numerator still refers to cluster and external solvent interac- 
tions while the denominator factors refer to the ligand (water) and external solvent 
interactions. The generalized expression for the solute chemical potential becomes 



g-iSA/i, 
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(35) 



n>0 




(36) 



8 ) This second assumption would be credible for systems with short-ranged forces away from 
a critical point and is the heart of the quasi-chemical approximations developed for critical 
phenomena. 



e- 13 ^ = Po J2K nPH2 o n . 

n>0 



(37) 



The bracketed ratio in Eq. (36) should be amenable to physical approximation 
because it should be insensitive to details of intermolecular interactions at short- 
range. One possible way to estimate the value of the ratio is with the common 
dielectric continuum model [73]. Using this model, we notice that the factors in 
the denominator are similar to contributions that will be involved in the numerator 
of the ratio. The contributions that aren't balanced in this way are of two types. 
The first of these are contributions associated with the nucleus of the cluster; but 
that nucleus is typically well-buried by a coating of ligands. The second unbalanced 
contribution is associated with through-solvent interferences between different lig- 
ands. Thus, the interactions that aren't balanced in this desirable way are all of 
somewhat longer range than the ones that are balanced. These arguments are of 
just the type that might be presented to support the classic quasi-chemical approxi- 
mation of Eq. (35). But if a physical approximation for external-cluster interactions 
is available — as with the dielectric models — then the results should be insensitive 
to details of the implementation, even when the quantitative contributions aren't 
negligible. In fact, this proved to be true in the lithium example. There it was 
found that increasing; the radius assumed for the Li + ion from R Li +=2.0A to 2.65A 
increased the predicted hydration free energy by only 1% [60]. 



Generalization of the Thermodynamic Derivation 



The thermodynamic model developed in the previous section, culminating with 
Eq. (9), can be generalized to include longer ranged correlations. This is achieved 
by exploiting the potential distribution theorem, Eq. (19), through the replacement 



(38) 



With this replacement of the ideal solute partition function, the mass-action rela- 
tions in Eqs. (7) and (8) remain valid because the central thermodynamic infor- 
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mation in Eq. (5) is correctly generalized to fi a = RT\n p a V/q a (e — ^ 
potential distribution theorem in Eq. (19) indicates. The equilibrium ratios K„ 
now have the interpretation as the actual, observed ratios established in solution 
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Finally we obtain the generalized expression for the interaction part of the chemical 
potential, 



[3A/j, Li + = - In 
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(40) 



We have elaborated the notation to clarify how the remaining Widom factor refers 
to the solute with zero ligands, as defined in the first logarithmic term. The av- 



bare solute and should be well described with the aid of dielectric continuum mod- 
els. It should be compared with the po term of Eq. (37) that was constructed by 
consideration of 'cluster interference' issues [68]. The difference in the structure of 
this result compared to Eq. (1) is a consequence of arranging the expression so that 
the actual, observed chemical equilibrium ratios K n appear here. 

Insisting on this latter point produces the remarkable quasi-chemical rule 



where x$ refers to the relative concentration of the n = solute cluster, as defined 
in Eq. (7) but with the actual equilibrium ratios K n utilized. This last form is 
more elegant than the previous expressions for the interaction contribution to the 
solute chemical potential. For simulation purposes, however, it is unlikely to be 
of direct practical use because xo will be difficult to determine with statistical 
precision from simulation observations. A virtue of the earlier result in Eq. (37) 
was its constructive character, which emphasized how these hydration problems 
could be broken down into smaller component problems that individually might be 
more susceptible to approximation. Nevertheless, this quasi-chemical rule is a new, 
general, and strikingly simple expression. 

Of course, the Xq indicated in this quasi-chemical rule, Eq. (41), must still be 
well-defined, therefore the recognition problem that was central to the technical 
discussion above is still essential. Nevertheless, the flexibility offered by the defi- 
nition of the bonding indicator functions, b a j, should be used in designing quasi- 
chemical approximations. The quasi-chemical rule expresses the compromise that is 
sought in that design. If those bonding regions are defined to be large, the average 
(e _|8Af7 n j(l + faj))o should be simple because macroscopic approximations should 
suffice. But then x would be correspondingly difficult to determine. In contrast, 
if those bonding regions are aggressively defined to be small, the determination of 
xq would be simpler, but then the exclusion average becomes nearly as difficult as 
the original problem. A compromise should be sought that makes each of these 
component problems manageable. 

Direct Derivation of the Quasi-chemical Rule 

The quasi-chemical rule, Eq. (41), is so simple that a direct derivation is called 
for. In view of the potential distribution theorem in Eq. (19), a more succinct 
statement of the quasi-chemical rule would be 
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x = 



The two indicated averages involve the same sample and the same normalizing 
denominators. The ratio is the average 




which includes the solute actually present, not as a test particle, and involves the 
full solute-solvent interactions. The quantity averaged, l\j(l + f a j), is one (1) if no 
solvent molecule penetrates the defined inner shell and zero (0) otherwise. Thus, 
xq of Eq. (43) is indeed the fraction of Li + ions with zero inner shell neighbors. 
This identification together with the potential distribution theorem of Eq. (19) 
then gives a direct derivation of the quasi-chemical rule, Eq. (41). This derivation 
justifies the name quasi-chemical rule because Eq. (41) is a simple, directly proved 
result that permits regeneration of all the quasi-chemical approximations discussed 
earlier. 

Though our earlier discussion noted the analogy of the first term on the right 
of Eq. (41) with the -lnpo of Eq. (1), it is worthwhile to consider an alternative 
analogy. The xo of the quasi-chemical rule can be interpreted probabilistically. 
This suggests that the information theory tools that have recently been applied 
to the analysis of po [64,90,96-100] might be usefully applied to understanding x 
also. Nevertheless, those are physically distinct problems. 

AN EXPLICIT-IMPLICIT SOLVATION MODEL 

Next we drive specifically toward a simulation procedure that permits treatment 
of the bulk of the solution in an implicit solvent fashion. As a beginning we note 
some broad points relevant to this goal. We do not intend to eliminate all the 
solvent molecules. It is clear that in many cases the solvent water molecules par- 
ticipate directly and individually on a molecular basis. In the interest of physical 
directness of the description, those molecules shouldn't be eliminated. The models 
we seek are then explicit-implicit hydration models. With a few important water 
molecules explicitly in the calculation, we will be satisfied with rough theories, such 
as dielectric models of the aqueous solution, for the solution more distant from the 
explicit action. 

Another important point is that we want to carefully limit the possibilities for 
explicit water molecules. This is more than merely an economic consideration. 
We want the thermal distributions of those explicit water molecules to be simple 
because it is the entropic aspects of these problems that are the least clear. Untem- 
pered structural fluctuations would be an undesirable feature of the implicit solvent 
models that we seek. The introduction of complications intended to make implicit 
solvent models more realistic, such as flexible external boundaries, can make more 
difficult the theoretical issues and the problem generally. The models we envision 
will be successful to the extent that only a few explicit water molecules with sim- 
ple possibilities for distribution must be treated and when rough theories will be 



satisfactory for the implicit solvent portions of the system. Our final point here is 
that we want to embed these models in a theoretical structure that can be used to 
analyze and test the limitations of the inevitable approximations. 

The idea for how to use the quasi-chemical development of Eq. (34) as an explicit- 
implicit solvation model is suggested by Fig. 4. Proximal volumes are defined 
around sites of the macromolecule that require specific care in treatment of neigh- 
boring water molecules. Referring to the formal treatment above, this definition 
implies specification of the bonding functions and that definition does not re- 
quire that the protein structure be rigidly constrained. As the theory is applied, 
therefore, the protein structure can respond to specific features of the complexa- 
tion. Since the volumes permitted for specific hydration are limited and sharply 
defined, the possibilities for problematic fluctuations of the structure of the complex 
are limited. Furthermore, the possibilities for a troublesome broad distribution of 
ligand occupancies are also limited. 

In these complicated settings, it is worthwhile to notice that the theoretical de- 
velopment can be expressed in a way that is consistent with understood simulation 
techniques. We consider a specific n th term of the sum of Eq. (34) and then the 



FIGURE 4. How the quasi-chemical approach might provide an explicit-implicit model of 
macromolecule hydration. The proximal volume is defined about a hydrophilic (carboxylate) 
side chain of a protein. The 'W's indicate water molecules that might occupy the defined bonding 
region. The material external to the protein plus ligand complex might be treated by a crude 
physical model such as a dielectric continuum model appropriate for longer-ranged and weaker 
interactions. 




inner-most average expressed there: 



s aW J°\l + n) Po (e-? AU )l z n q aWn = e -^ B d + n) (£) . (44) 

This equation introduces the conformational free energy J~aW n (1 + n ) associated 
with a geometry of the solute and n specific solvent (water) molecules within the 
defined proximal volume. The factor n! on the right arises because n\ q a w n is the 
normalizing feature of s a wj°\i- + n) involved in Eq. (30). Our quasi-chemical 
master formula, Eq. (34), can then be expressed as 9 

7 L = V- 1 Y.QTr n (e-^). (45) 

n>0 V a - 7 

Tr n (. . .) indicates the trace as it might appear, for example, in the conventional 
evaluation of a canonical partition function. That term evaluates to n! q a w n — the 
'configuration integral' [101] for the isolated cluster — in the case that the external- 
cluster interactions are negligible. The formula Eq. (45) suggests a grand canonical 
ensemble calculation but with some primitive and essential differences. Firstly, the 
volume of the system is attached to the solute and, in fact, this volume can change 
as the solute changes conformation. Secondly, the function appearing where the 
potential energy would appear is the conformational free energy T cWn {\ + n). This 
depends on the thermodynamic state of the solvent, that is, on z and (3. When 
z=0, only the n=0 term of Eq. (45) is non-zero and then Eq. (45) is trivially verified 
with ( 5A/i cr =0. Finally, the thermodynamic identification of this partition function 
is the combination on the left side of Eq. (45). 

Despite these important differences, the structural similarity of Eq. (45) with 
a grand canonical partition function is remarkable. Calculations that used this 
approach would be built upon a model for T a w n {^- + n )j which included physical 
approximations for the extra-cluster contributions, and on the simulation algo- 
rithms for calculations on grand canonical ensembles. Grand canonical simulation 
calculations are less routine than other types of simulations and probably more 
demanding when applied to condensed phases. If grand canonical simulations were 
prohibative in a particular case because of a lack of facility of molecule exchange, 
then these approaches might also be prohibitive because of that. Practical points 
of difference might help, however. The present development is designed so that 
the number of ligand molecules needn't be great. Additionally, the conformational 
changes of the solute can change the defined proximal volume, changing the local 
ligand binding characteristics either to squeeze out ligands or to open space for 
insertion of new ligands. 

Again we emphasize the idea that if the proximal volumes are defined to be ag- 
gressively small then the evaluation of the partition functions indicated in Eq. (45) 



9 ) p/z is a standard combination of variables for reasons that Eq. (19) makes clear. It is also 
common to employ the notation ( a = z a q a /V so that p a /( a —* 1 when interactions are negligible. 



should be simpler. In the most favorable case, the indicated partition functions 
might be evaluated by a maximum term and gaussian distribution procedure. In 
that case, the computational work to exploit this proposal is only slightly greater 
than a determination of an optimum hydration structure for the ligands. Roughly 
put, the theory provides a justification for treating a small set of waters as part 
of the solute molecule under study. The theory then suggests consideration of the 
'chemical' reaction for isolated reaction species. According to this approach, the 
study of the ligands and of the complex, then inclusion of a simple implicit solvent 
model permits an inference of the basic hydration free energy for the solute alone. 

DISCUSSION 

In the area of computational theory for aqueous solution chemistry and bio- 
chemistry, there seems to be a well developed folklore that judicious inclusion of 
a pivotal few water molecules is typically the important next step beyond dielec- 
tric continuum implicit solvent models. The quasi-chemical approach developed 
here is the statistical mechanical theory for how to do that properly for solution 
thermodynamics. 

Implicit Solvent Models 

One goal of an implicit solvent model is to increase computational scope. Other- 
wise inaccessible problems might become accessible to study. Or, perhaps, hundreds 
of instances might be checked with implicit solvent models where only one instance 
could be examined if sufficient water must be explicitly included. This increased 
scope is accomplished, in concept at least, by reducing the amount of solvent that 
must be explicitly tracked in a simulation calculation. The prices to be paid for this 
reduction are the approximations that must be accepted and the increased com- 
plexity of the implicit calculation. How the advantages and complications balance 
in particular cases typically has been unclear. 

Nevertheless, there are other potential advantages of implicit solvent models and 
the chief of these is the same advantage offered by approximate physical theories. 
Scientific use of such theories serves to establish simplified concepts that are valid 
and valuable where the goal is understanding as well as predicting. The physi- 
cal idea implicit in the developments above is that a handful of proximal water 
molecules are decisive for chemistry and biochemistry in aqueous solution. Even 
for such extended molecular events as protein folding, it is a valuable hypothesis 
that a small number of water molecules, perhaps hundreds rather than a hundred 
times that, play a decisive role. 

The quasi-chemical factors of pu 2 o n of Eqs. (1) and (35) are the most basic fea- 
ture of the environment mean-field operating on the solvent molecules included 
explicitly. This identification is a basic difference from other implicit solvent mod- 



els and suggests the possibility of aggressively pushing the numbers of explicitly 
included solvent molecules down to minimal chemically reasonable values. 

Dielectric Models and Electronic Structure Calculations for 

Solution Thermodynamics 

Explicit treatment of water molecules that are near-neighbors of the solutes 
should mitigate [102] the awkward, not always well-defined invocations of "dielec- 
tric saturation and electrostriction." In this way quasi-chemical developments also 
address lingering foundational problems with dielectric models in applications to 
solution chemistry. Parameterization of dielectric models to describe variations of 
hydration free energies with temperature, pressure, composition, and solute con- 
formation [69-73] are as relevant as the foundational issues. For the quasi-chemical 
models used here, some empirical parameterization is still required for the ligands 
that form the exterior of the complex. In the Li + example, this means radii pa- 
rameters are required for the water molecules involved. However, the results are 
insensitive to radii parameters required for the solute Li + ion; and furthermore, as 
the theoretical development emphasizes, the product of such efforts is the thermo- 
dynamic characteristics of the solute, not the ligands. The development above has 
emphasized that much of the thermodynamic state dependence of hydration free 
energies is explicitly available in quasi-chemical approaches. 

Similarly, phenomena involving the electronic structure of the solute are natu- 
rally, though approximately, included in quasi-chemical theories. Examples of such 
phenomena are nuclear versus electronic time scale issues, dispersion interactions 
and electron correlation more generally, the orthogonality or 'charge-leakage' is- 
sues that are currently discussed for dielectric models, and solute-solvent charge 
transfer that was observed for the Li + example, see Table 1. They may be ex- 
cluded in unvarnished dielectric models, or included in only ad hoc ways. Since it 
is the molecular and thermodynamic properties of the solute that is the goal of the 
quasi-chemical theories, the phenomena listed above are reasonably included for 
the solute properties. 

CONCLUSIONS 

This paper has given an extended development of the quasi-chemical approach to 
computational solution chemistry and biochemistry and discussed the relevance of 
this development to problems of implicit solvent models of the structural molecular 
biology. The new implicit solvent model presented here is, more accurately, an 
explicit-implicit model applicable to the statistical thermodynamics of solutions. 
The physical idea for the model is that a small subset of solvent molecules are 
decisive for the statistical thermodynamics and require explicit treatment, whereas 
the remaining solvent molecules are less important and may be treated implicitly. 



The formal derivation of the quasi-chemical formula depends on the well-known 
potential distribution theorem aided by a clustering analysis that serves as a count- 
ing device. In the process of the derivation, it is shown how the view of the system 
is transformed from a single solute in solution to a solute complexed to solvent 
ligand molecules interacting with the extra-complex solution. The thermodynamic 
derivation and the quasi-chemical rule that results, Eq. (41), are new. 

To illustrate the quasi-chemical approach, the hydration thermodynamic prop- 
erties of Li + (aq) are calculated. The problem is framed as a series of ion hydration 
reactions involving the isolated ion, water molecule ligands, and ion-ligand com- 
plexes. Application of an implicit solvent model to the species, the standard dielec- 
tric continuum model in this case, accounts for interactions with the extra-complex 
solvent molecules. The results predict that Li + in liquid water is surrounded by 
four (4) inner shell water molecule ligands with a chemical potential comparable 
to experimental results. Since the explicit pressure and temperature dependencies 
of the chemical potential are analytically available in the quasi-chemical formula, 
results for the partial molar volume and entropy of Li + (aq) are explicitly avail- 
able. The evaluations of the partial molar volume and partial molar entropy of the 
Li + (aq) give a snapshot of the status of molecular theory for these properties. 
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